58 research outputs found

Automated Protein Structure Classification: A Survey

Author: Hassanzadeh Oktie
Publication date: 01/01/2008

Classification of proteins based on their structure provides a valuable resource for studying protein structure, function and evolutionary relationships. With the rapidly increasing number of known protein structures, manual and semi-automatic classification is becoming ever more difficult and prohibitively slow. Therefore, there is a growing need for automated, accurate and efficient classification methods to generate classification databases or increase the speed and accuracy of semi-automatic techniques. Recognizing this need, several automated classification methods have been developed. In this survey, we review recent developments in this area. We classify different methods based on their characteristics and compare their methodology, accuracy and efficiency. We then present a few open problems and explain future directions. Comment: 14 pages, Technical Report CSRG-589, University of Toronto

arXiv.org e-Print Archive

CiteSeerX

Event Prediction using Case-Based Reasoning over Knowledge Graphs

Author: Bhattacharjya Debarun
Hassanzadeh Oktie
Shirai Sola
Publication date: 21/09/2023

Applying link prediction (LP) methods over knowledge graphs (KG) for tasks such as causal event prediction presents an exciting opportunity. However, typical LP models are ill-suited for this task as they are incapable of performing inductive link prediction for new, unseen event entities and they require retraining as knowledge is added or changed in the underlying KG. We introduce a case-based reasoning model, EvCBR, to predict properties about new consequent events based on similar cause-effect events present in the KG. EvCBR uses statistical measures to identify similar events and performs path-based predictions, requiring no training step. To generalize our methods beyond the domain of event prediction, we frame our task as a 2-hop LP task, where the first hop is a causal relation connecting a cause event to a new effect event and the second hop is a property about the new event which we wish to predict. The effectiveness of our method is demonstrated using a novel dataset of newsworthy events with causal relations curated from Wikidata, where EvCBR outperforms baselines including translational-distance-based, GNN-based, and rule-based LP models. Comment: published at WWW '23: Proceedings of the ACM Web Conference 2023. Code base: https://github.com/solashirai/WWW-EvCB

arXiv.org e-Print Archive

Improving Neural Ranking Models with Traditional IR Methods

Author: Gittens Alex
Hassanzadeh Oktie
Ni Jian
Saha Anik
Srinivas Kavitha
Yener Bulent
Publication date: 29/08/2023

Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions. Nevertheless, they are computationally expensive to create, and require a great deal of labeled data for specialized corpora. In this paper, we explore a low-resource alternative, a bag-of-embedding model for document retrieval, and find that it is competitive with large transformer models fine-tuned on information retrieval tasks. Our results show that a simple combination of TF-IDF, a traditional keyword matching method, with a shallow embedding model provides a low-cost path to compete well with the performance of complex neural ranking models on 3 datasets. Furthermore, adding TF-IDF measures improves the performance of large-scale fine-tuned models on these tasks. Comment: Short paper, 4 pages

arXiv.org e-Print Archive

A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction

Author: Gittens Alex
Hassanzadeh Oktie
Ni Jian
Saha Anik
Srinivas Kavitha
Yener Bulent
Publication date: 07/08/2023

Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare them with a span-based approach to causality extraction. Our experiments show that embeddings from pre-trained language models (e.g. BERT) provide a significant performance boost on this task compared to previous state-of-the-art models with complex architectures. We observe that span-based models perform better than simple sequence tagging models based on BERT across all 4 data sets from diverse domains with different types of cause-effect phrases.

arXiv.org e-Print Archive

OM-2017: Proceedings of the Twelfth International Workshop on Ontology Matching

Author: Cheatham Michelle
Euzenat Jérôme
Hassanzadeh Oktie
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Publication venue: No commercial editor.
Publication date: 01/01/2017

International audience. Ontology matching is a key interoperability enabler for the semantic web, as well as a useful tactic in some classical data integration tasks dealing with the semantic heterogeneity problem. It takes ontologies as input and determines as output an alignment, that is, a set of correspondences between the semantically related entities of those ontologies. These correspondences can be used for various tasks, such as ontology merging, data translation, query answering or navigation on the web of data. Thus, matching ontologies enables the knowledge and data expressed with the matched ontologies to interoperate.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Ontology Matching: OM-2018: Proceedings of the ISWC Workshop

Author: Cheatham Michelle
Euzenat Jérôme
Hassanzadeh Oktie
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Publication venue: No commercial editor.
Publication date: 01/01/2018

International audience. No abstract.

INRIA a CCSD electronic archive server

LakeBench: Benchmarks for Data Discovery over Data Lakes

Author: Abdelaziz Ibrahim
Chaudhury Subhajit
Dolby Julian
Hassanzadeh Oktie
Khatiwada Aamod
Kokel Harsha
Pedapati Tejaswini
Samulowitz Horst
Srinivas Kavitha
Publication date: 09/07/2023

Within enterprises, there is a growing need to intelligently navigate data lakes, specifically focusing on data discovery. Of particular importance to enterprises is the ability to find related tables in data repositories. These tables can be unionable, joinable, or subsets of each other. There is a dearth of benchmarks for these tasks in the public domain, with related work targeting private datasets. In LakeBench, we develop multiple benchmarks for these tasks by using the tables that are drawn from a diverse set of data sources such as government data from CKAN, Socrata, and the European Central Bank. We compare the performance of 4 publicly available tabular foundational models on these tasks. None of the existing models had been trained on the data discovery tasks that we developed for this benchmark; not surprisingly, their performance shows significant room for improvement. The results suggest that the establishment of such benchmarks may be useful to the community to build tabular models usable for data discovery in data lakes.

arXiv.org e-Print Archive

Proceedings of the 15th ISWC workshop on Ontology Matching (OM 2020)

Author: Euzenat Jérôme
Hassanzadeh Oktie
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Trojahn dos Santos Cassia
Publication venue: CEUR.org
Publication date: 01/01/2020

15th International Workshop on Ontology Matching, co-located with the 19th International Semantic Web Conference (ISWC 2020). International audience.

INRIA a CCSD electronic archive server

Proceedings of The Tenth International Workshop on Ontology Matching (OM-2015)

Author: Cheatham Michelle
Euzenat Jérôme
Hassanzadeh Oktie
Ichise Ryutaro
Jiménez-Ruiz Ernesto
Shvaiko Pavel
Publication venue: No commercial editor.
Publication date: 01/01/2016

International audience. No abstract.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server